ABSTRACT

The Cox protein from bacteriophage P2 is a small 
multifunctional DNA-binding protein. It is involved in 
site-specific recombination leading to P2 prophage 
excision and functions as a transcriptional repres- 
sor of the P2 Pc promoter. Furthermore, it transcrip- 
tionally activates the unrelated, defective prophage 
P4 that depends on phage P2 late gene products for 
lytic growth. In this article, we have investigated the 
structural determinants to understand how P2 Cox 
performs these different functions. We have solved 
the structure of P2 Cox to 2.4 Å resolution.
Interestingly, P2 Cox crystallized in a continuous 
oligomeric spiral with its DNA-binding helix and 
wing positioned outwards. The extended 
C-terminal part of P2 Cox is largely responsible 
for the oligomerization in the structure. The 
spacing between the repeating DNA-binding 
elements along the helical P2 Cox filament is 
consistent with DNA binding along the filament. 
Functional analyses of alanine mutants in P2 
Cox argue for the importance of key residues 
for protein function. We here present the first struc- 
ture from the Cox protein family and, together 
with previous biochemical observations, propose 
that P2 Cox achieves its various functions by 
specific binding of DNA while wrapping the DNA 
around its helical oligomer. 



INTRODUCTION 

Bacteriophage P2 is a non-lambdoid phage that can infect
several enteric bacterial species (1). It is the prototype of
the P2-like family of phages found in gamma-proteobacteria,
and almost 30% of natural Escherichia coli isolates
contain a P2-like prophage (2). After infection,
P2 can either establish lysogeny or grow lytically, which
ultimately leads to cell lysis and the release of progeny 
phage particles. In the former case, the infected cells 
survive and the phage's DNA gets integrated into the 
chromosome of its host organism. The protein P2 Cox
(control of excision), a small multifunctional protein of
91 amino acids, has previously been shown to have an
intrinsic role in these processes. P2 Cox plays a role in P2
prophage excision, in transcriptional repression of the P2 Pc
promoter and in transcriptional activation of the satellite
phage P4 PLL promoter. Phage P4 is dependent on structural
genes and lysis functions of phage P2 for lytic growth
[for a review, see (3)].

The lysogenic state of temperate phages is controlled by 
a regulatory protein, the immunity repressor. In phage P2, 
this task is performed by the C protein, which binds to the 
operator region of the P2 early operon containing the cox 
gene and thereby represses the expression of this operon.
The C protein is transcribed in the opposite direction to 
the early operon and contains a helix-turn-helix (HTH) 
motif that binds direct repeats in the operator region (4). 
The region between these coding regions is ~110 base 
pairs and contains the converging promoters of the early 
operon Pe, and the repressor Pc (5-7). The product of the
cox gene, on the other hand, inhibits the synthesis of
the immunity repressor C. By binding to the Pc
promoter, P2 Cox acts as a repressor of the lysogenic
operon of the phage, controlling the expression of the
immunity repressor and the integrase (8). In this respect,
it is thus functionally equivalent to the Cro protein of
phage λ (9).

Early findings showed that P2 cox defective prophages 
were unable to produce free phages spontaneously during 
growth (10). In site-specific recombination, P2 Cox functions
as a directionality factor by preventing integrative
recombination and promoting excisive recombination
(11). In this sense it is functionally equivalent to
bacteriophage λ excisionase (λXis) (12-14).

The satellite phage P4 carries the genes for transcrip- 
tional control, DNA replication and lysogenization, but it 
cannot perform lytic growth without structural genes and 
lysis machinery from a helper, e.g. the P2 phage (15). The 
derepression of phage P4 lysogens requires an intact P2 
cox gene, and the binding site for P2 Cox has been located 
to a region upstream of the P4 late promoter, PLL (16).

The native form of the P2 Cox protein in solution is 
multimeric (17), and upon binding to its target it 
protects a region of at least 70 nucleotides from DNase
I cleavage (11,16). A comparison of the protected regions
reveals a consensus sequence, TTAAA(G/C)NCA. Its occurrences,
denoted cox-boxes, are present in at least six copies in the
binding sites. Interestingly, the direction of the cox-boxes
varies within the different targets, and binding induces a
strong bend in the DNA (18).
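
As an illustration of what a cox-box search on both strands could look like, the following minimal sketch scans a sequence for the consensus TTAAA(G/C)NCA in either orientation. The function names and the test sequence are hypothetical and are not taken from the work described here; real binding sites may deviate from the consensus, which a strict pattern match would miss.

```python
import re

# Consensus quoted in the text: TTAAA(G/C)NCA, where N is any base.
# A lookahead is used so that overlapping matches would also be reported.
COX_BOX = re.compile(r"(?=(TTAAA[GC][ACGT]CA))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_cox_boxes(seq: str) -> list[tuple[int, str, str]]:
    """List (start, strand, match) for every consensus hit on either strand.

    Start positions are 0-based and always refer to the top strand.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in COX_BOX.finditer(seq)]
    rc = reverse_complement(seq)
    for m in COX_BOX.finditer(rc):
        # Convert the position on the bottom strand to top-strand coordinates.
        start = len(seq) - (m.start() + len(m.group(1)))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


# HYPOTHETICAL test sequence (not a real P2 or P4 region):
# one box on the top strand and one on the bottom strand.
example = "GGCTTAAAGTCACCGGTGCGTTTAAGG"
for start, strand, box in find_cox_boxes(example):
    print(start, strand, box)
```

Reporting the strand alongside the position makes the varying orientation of the boxes within a target directly visible.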

The crystal structures of the λ Cro and λ Xis proteins
have been determined (9,19). The 66-amino-acid
λCro protein is a typical example of an HTH DNA-binding
protein that binds to its 17-base operator
sequence as a dimer and bends the DNA (12,13,20,21).
The 77-amino-acid λXis protein forms a winged
HTH motif with an unstructured C-terminal tail (19,22).
The crystal structure of a truncated version of λXis
lacking the last 27 residues (referred to below as
truncated λXis) has been determined together with its
33-nucleotide DNA target, indicating that λXis forms a
nucleoprotein filament with its DNA target (20,23). There
is no or very low sequence similarity between P2 Cox and
λCro (9%) or λXis (20%), but a secondary structure prediction
indicates that P2 Cox belongs to the winged HTH family
of DNA-binding proteins (18,24,14).

In this article, we have determined the structure of full- 
length P2 Cox protein, which constitutes the first structure 
from the Cox protein family. The structure reveals that P2 
Cox indeed has a winged HTH structure and, despite low 
sequence similarities, is structurally closely related to λXis.
Interestingly, the structure suggests that P2 Cox oligomerizes
as a helical filament, and in that state binds and winds
up DNA around the helical filament. Genetic and functional
studies indicate the importance of the functional
residues, and we propose that the C-terminal domain of
P2 Cox is required for its oligomerization and function.
Our study provides insight into the structure and function 
of the Cox proteins and the unexpected properties of the 
C-terminal region of these proteins. 



MATERIALS AND METHODS 

Site-directed mutagenesis 

All constructs for the alanine screening were generated
using the QuikChange™ Site-Directed Mutagenesis Kit
or the QuikChange Lightning Multi Site-Directed
Mutagenesis Kit (Stratagene, USA). The primers used for
mutagenesis were obtained from Eurofins MWG
Operon, Germany, Thermo Fisher Scientific, Germany
or DNA Technology A/S, Denmark. All constructs were
verified by DNA sequencing by Eurofins MWG Operon
and Macrogen Inc., Korea. The primers used are listed in
Supplementary Table S1.



Protein purification 

P2 Cox protein was purified from E. coli BL21(DE3)
pLysE (25) containing plasmid pEE720 (17). The
bacteria were grown in M9 minimal medium in a LEX
bubbling system (SGC, Toronto, Canada) at 37°C until
OD600 0.6, when 30 ml/l of amino acid mixture [100 mg
each of lysine, threonine, phenylalanine, 50 mg each of
leucine, isoleucine, valine and L(+)-selenomethionine] was
added. After another 20 min, isopropyl-β-D-thiogalactopyranoside
(IPTG) was added to a final concentration of
0.5 mM. The culture was allowed to grow overnight at
22°C until harvested by centrifugation. The pellet was
resuspended in Buffer C (0.3 M potassium phosphate
buffer, pH 7.5, 3 mM EDTA, 0.5 M KCl, 0.1% Triton,
0.5 mM TCEP) at 6 ml per gram of cells and freeze-thawed
twice to allow leakage of lysozyme and partial
lysis. To complete lysis, resuspended cells were sonicated
on ice in four 30 s bursts at 12-14 μm with an MSE
Soniprep 150. The extract was clarified by centrifugation
in a Sorvall RC5C at 23 000 g for 1 h at 4°C, and
ammonium sulfate was added to the supernatant to 25%
saturation. After being gently stirred at 4°C for 1 h, the
mixture was centrifuged at 17 000 g for 30 min, and the
pellet was subsequently resuspended in Buffer I (20 mM
Tris-HCl, pH 7.5, 150 mM NaCl, 15% isopropanol,
0.5 mM TCEP) followed by filtration through 0.45
and 0.22 μm filters. The extract was loaded on a
HiLoad 16/60 Superdex200 column (GE Healthcare,
Sweden). The Cox-containing fractions were analyzed
by 20% homogeneous SDS-PAGE gels using the
PhastSystem (GE Healthcare) and concentrated using
Vivaspin Centrifugal Concentrator (Vivaproducts, MA,
USA). When purifying the non-labeled Cox protein, an
additional anion-exchange purification step was performed
after the ammonium sulfate precipitation using
HiTrap DEAE columns (GE Healthcare). Proteins were
eluted at 2 M KCl followed by desalting on PD-10
columns (GE Healthcare) pre-equilibrated with Buffer
II (20 mM Tris-HCl, pH 7.5, 150 mM NaCl) before
gel filtration. Both purification procedures resulted in a
very low protein yield (<1%), probably reflecting
the tendency of P2 Cox to form higher order complexes
in solution, which were subsequently lost in the void
volume on the size exclusion chromatography column.

Crystallization and structure determination

Crystallization was performed at 5.5 mg/ml of Cox using
sitting drop vapor diffusion, with a reservoir solution
containing 0.2 M Na-citrate pH 5.0, 24% PEG 4000 and 20%
isopropanol, in a 1:1 drop ratio. Diffraction quality crystals
grew at 18°C and appeared after 4-12 weeks. Crystals were
frozen in liquid nitrogen without any further cryo-protection.
Native diffraction data were collected at beamline
ID14-2 at ESRF, France. The structure was solved using
single-wavelength anomalous diffraction (SAD) data collected
at beamline PXI at the SLS, Switzerland. Data processing
and reduction were performed using XDS and programs
from the CCP4 suite (26,27). Relevant statistics for phasing
and refinement can be found in Table 1. Weak initial phase
information was found using Phenix AutoSOL (28), which
gave an initial model of 43 residues. Only one of the five
selenium atoms per monomer was located, with a σ level
of 4. Numerous rounds of manual model building in Coot
(29), and subsequent refinement in Refmac5 and
Phenix.refine (28,30), were needed to complete the model.
The native dataset diffracted to 2.4 Å, although it had
suboptimal data completeness (92.9% overall). Unfortunately,
it was not isomorphous to the complete 2.9 Å SeMet
dataset. The final model was therefore compared to the
complete SeMet dataset. No differences were found, and
since the electron density of the high-resolution dataset was
substantially better, that dataset was used for the final round
of model refinement. The final model was verified using
Molprobity (31), and obtained a Molprobity score in
the 97th percentile. All residues were within the
Ramachandran allowed regions. The Rwork and Rfree of
the final model were 20.9% and 25.3%, respectively. All
structure figures were prepared using PyMOL.
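
A quick plausibility check on the crystal contents can be sketched from the native cell dimensions in Table 1. The example below computes the hexagonal unit cell volume and a Matthews coefficient for one monomer per asymmetric unit. The protein mass is an assumption (roughly 110 Da per residue for 91 residues), since the actual mass is not stated here, so the resulting numbers are only indicative.

```python
import math


def hexagonal_cell_volume(a: float, c: float) -> float:
    """Volume (Å^3) of a cell with a = b, alpha = beta = 90° and gamma = 120°."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(volume: float, copies_per_cell: int, mass_da: float) -> float:
    """Matthews coefficient V_M in Å^3 per dalton."""
    return volume / (copies_per_cell * mass_da)


def solvent_fraction(v_m: float) -> float:
    """Approximate solvent fraction, using the common 1.23 Å^3/Da protein term."""
    return 1.0 - 1.23 / v_m


# Native cell from Table 1. Space group P6(5) has six symmetry-equivalent
# positions, and the asymmetric unit holds one monomer, so six copies per cell.
volume = hexagonal_cell_volume(a=66.3, c=33.8)

# ASSUMPTION: ~110 Da per residue for the 91-residue protein.
assumed_mass = 91 * 110.0

v_m = matthews_coefficient(volume, copies_per_cell=6, mass_da=assumed_mass)
print(f"cell volume = {volume:.0f} Å^3")
print(f"V_M = {v_m:.2f} Å^3/Da, solvent fraction = {solvent_fraction(v_m):.2f}")
```

Under these assumptions the value falls in the range commonly seen for protein crystals, which is consistent with a single molecule in the asymmetric unit.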

Table 1. Data collection and refinement statistics

                                 Native                 SeMet
Data collection
  Space group                    P65                    P65
  Cell dimensions
    a, b, c (Å)                  66.3, 66.3, 33.8       64.8, 64.8, 33.9
    α, β, γ (°)                  90.0, 90.0, 120.0      90.0, 90.0, 120.0
  Resolution (Å)                 33.1-2.4 (2.48-2.40)   56.1-2.8 (2.9-2.8)
  Rmeas (%)                      8.3 (54.1)             11.8 (78.2)
  I/σ(I)                         17.3 (2.4)             5.0 (1.3)
  Completeness (%)               92.9 (86.1)            99.2 (94.9)
  Redundancy                     6.0                    10.0
Refinement
  Resolution (Å)                 33.1-2.4
  Number of unique reflections   3097
  Rwork/Rfree (%)                20.9/25.3
  Number of atoms
    Protein                      1266
  B-factors
    Protein                      90.6
  R.m.s. deviations
    Bond lengths (Å)             0.003
    Bond angles (°)              0.733

Values in parentheses are for the highest-resolution shell.

Complementation assay

Strain C-6005 was transformed with pEE720 derivatives
containing the wild-type cox gene (17) or various alanine
substitutions. Strain C-6005 (8) is a C-1a derivative [F-
prototrophic E. coli C strain (32)], lysogenized by the cox
defective P2 mutant cox3. The transformants were grown
at 30°C overnight in LB, supplemented with ampicillin
(100 μg/ml) for the transformed cells and with
0.2 M potassium phosphate buffer pH 6.8 to avoid
re-adsorption of released phages. The bacterial (CFU/ml)
and phage (PFU/ml) titers were determined after
dilution, in triplicate. C-1757 [polyauxotrophic E. coli C
strain strR supD (33)] was used as an indicator strain for
phage titrations.
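
The readout of this assay, phage released per bacterium and expressed relative to the wild-type plasmid, can be sketched as follows. The titers are hypothetical placeholders, not measured values, and the function names are illustrative; the normalization follows the description given for Figure 2b (normalized to the pEE720 wild type, with the standard deviation taken over the normalized values).

```python
import math
from statistics import mean, stdev


def pfu_per_cfu(pfu_per_ml: float, cfu_per_ml: float) -> float:
    """Phage released per colony-forming bacterium in one culture."""
    return pfu_per_ml / cfu_per_ml


def normalize_to_wild_type(variant: list[float], wild_type: list[float]) -> tuple[float, float]:
    """Mean and standard deviation of variant ratios divided by the mean wild-type ratio."""
    reference = mean(wild_type)
    normalized = [ratio / reference for ratio in variant]
    return mean(normalized), stdev(normalized)


# HYPOTHETICAL titers (PFU/ml, CFU/ml) for three independent cultures each.
wild_type = [pfu_per_cfu(p, c) for p, c in [(2.0e6, 1.0e9), (3.0e6, 1.0e9), (2.5e6, 1.0e9)]]
variant = [pfu_per_cfu(p, c) for p, c in [(2.0e2, 1.0e9), (3.0e2, 1.0e9), (2.5e2, 1.0e9)]]

avg, sd = normalize_to_wild_type(variant, wild_type)
print(f"normalized PFU/CFU = {avg:.1e} +/- {sd:.1e}; log10 = {math.log10(avg):.1f}")
```

Working with the ratio rather than the raw phage titer removes differences in culture density between transformants before the comparison with the wild type is made.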

Expression and solubility of P2 Cox variants 

pEE720 derivatives containing the wild-type cox gene or
various alanine substitutions were transformed into
BL21(DE3) pLysE. The bacteria were grown in 35 ml of
LB medium at 37°C until OD600 0.6, when IPTG was
added to a final concentration of 0.5 mM. The culture
was allowed to grow for another 3 h at 37°C until harvested
by centrifugation. To monitor the expression of the different
Cox variants, a small aliquot was collected before
centrifugation and loaded onto 20% homogeneous SDS-PAGE
gels using the PhastSystem (GE Healthcare). The
pellet was resuspended in Buffer CI (0.3 M potassium
phosphate buffer, pH 7.5, 3 mM EDTA, 0.5 M KCl,
0.1% Triton) at 6 ml per gram of cells and freeze-thawed
twice to allow leakage of lysozyme and partial lysis. To
complete lysis, resuspended cells were sonicated on ice in
two 15 s bursts at 12-14 μm with an MSE Soniprep 150.
To decrease the viscosity of the lysate, 10 units of DNase I
(Fermentas) were added and the lysate was incubated for
5 min at 37°C. The extract was clarified by centrifugation
in a Sorvall RC5C at 16 000 g for 2 h at 4°C. Total protein
concentration of the lysates was measured in a Nanodrop
(Thermo Scientific). To monitor the solubility of the
different Cox variants, 75 μg of total protein from the
cleared lysate was loaded onto 17% ClearPAGE SDS gels
(C.B.S. Scientific).

Data analysis 

An alignment of 20 cox-boxes (34) was used as input into 
WebLogo (35) to generate a sequence logo of the cox-boxes.
All structural figures were prepared using PyMOL.
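
For readers unfamiliar with sequence logos, the following sketch shows the quantity a logo displays: the information content of each alignment column in bits, distributed over the letters according to their frequency. The four aligned sites are hypothetical and are not the 20 cox-boxes used here, and the sketch omits the small-sample correction that logo software may apply, so it should be read as an outline of the principle only.

```python
import math
from collections import Counter

BASES = "ACGT"


def column_information(column: str) -> float:
    """Information content (bits) of one alignment column, without small-sample correction."""
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(len(BASES)) - entropy


def logo_heights(aligned: list[str]) -> list[dict[str, float]]:
    """Per-position letter heights: letter frequency times column information."""
    heights = []
    for column in zip(*aligned):
        info = column_information("".join(column))
        counts = Counter(column)
        heights.append({b: counts[b] / len(column) * info for b in BASES if counts[b]})
    return heights


# HYPOTHETICAL aligned sites that follow the TTAAA(G/C)NCA pattern loosely.
sites = ["TTAAAGGCA", "TTAAACTCA", "TTAAAGACC", "TTAAACCCA"]
for position, letters in enumerate(logo_heights(sites), start=1):
    print(position, {base: round(height, 2) for base, height in letters.items()})
```

Fully conserved positions reach the maximum of 2 bits, a position split evenly between G and C carries 1 bit, and a position where all four bases are equally frequent carries none.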



RESULTS 

Structure of P2 Cox 

Crystallization experiments yielded crystals of P2 Cox in
space group P65, diffracting to 2.4 Å (Table 1). The structure
was solved with the SAD technique, using
selenomethionine-incorporated Cox. The asymmetric unit
contained one molecule of P2 Cox, and the final model
comprises residues 7-87. The first 6 residues, the last 4
residues and the loop residues G50-R51 were (partially)
disordered and not visible in the resulting electron density.
The final Rwork/Rfree values were 20.9/25.3%. The overall
fold of P2 Cox shows a winged HTH structure consisting of a
three-stranded antiparallel β-sheet with a 12-residue loop
packed against the HTH motif, and the C-terminal
domain containing two α-helices. The secondary structure
elements are arranged in a β1-α1-T-α2-β2-L-β3-α3-α4
order, with T and L being the turn and loop structures,
respectively (Figure 1a).

The overall fold is similar to the previously determined
truncated λXis (20,12,13) (PDB code: 2IEF), with an r.m.s.d.
of 1.0 Å over 40 Cα atoms between the two structures.
The major difference is the loop between β2 and β3 in
the wing, which in P2 Cox is much longer, with 12
residues versus 3 in λXis. In the truncated λXis-DNA complex
structure, this loop binds to the minor groove of DNA.
In contrast to the truncated λXis protein, the full-length P2
Cox protein was crystallized, thereby revealing the structure
of the C-terminal domain with helices α3 and α4
(Figure 1b).

Structural similarity is also found with the Xis protein
of the conjugative transposon Tn916, Tn916 Xis (PDB code:
1Y6U), which resembles the fold of Cox with an overall r.m.s.d.
of ~2 Å and a sequence similarity of 11% (22).
Similar to λXis, the loop of the wing in Tn916 Xis is much
shorter than that of Cox. The Tn916 Xis protein
is thought to enhance excision in a similar manner to the
λXis protein, although clear mechanistic differences
between the two have been suggested (23).

P2 Cox oligomerization 

It has previously been shown that oligomerization of P2
Cox is required for its biological activity (17). Analysis of
the crystal packing shows that extensive
contacts are formed between the P2 Cox monomers in the
crystal. Each monomer interacts with three other
monomers to form a helix-shaped packing in the crystal
(Figure 1c). The helical filament has six monomers per
turn, a diameter of 65 Å and a rise of 35 Å per turn, and
is left-handed. The intermolecular interactions bury a
large surface area; monomer n has a buried surface area
of 1040 Å² to molecule n+1, 1050 Å² to molecule n-1
and 245 Å² to molecule n-2. Since the total surface
area of each monomer is ~4300 Å², 55% is involved in
intermolecular contacts. This is a very high percentage;
normal values for biological oligomers are between 25%
and 40% (21,36), strongly indicating that the packing is
not an artifact of crystallization but is indeed
biologically relevant.
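
The arithmetic behind the buried fraction can be reproduced directly from the three interface areas and the approximate total surface area quoted above. The sketch below does only that; the variable names are illustrative, and because the total surface area is itself approximate, the result (about 54%) should be read as agreeing with the rounded 55% rather than as a more precise value.

```python
# Buried surface areas (Å^2) of monomer n against its neighbours, as quoted in the text.
buried = {"n+1": 1040.0, "n-1": 1050.0, "n-2": 245.0}

# Approximate total surface area of one monomer (Å^2), as quoted in the text.
total_surface = 4300.0

# Range given in the text for typical biological oligomers.
typical_range = (0.25, 0.40)


def buried_fraction(contacts: dict[str, float], total: float) -> float:
    """Fraction of the monomer surface involved in intermolecular contacts."""
    return sum(contacts.values()) / total


fraction = buried_fraction(buried, total_surface)
print(f"buried area = {sum(buried.values()):.0f} Å^2")
print(f"buried fraction = {100 * fraction:.1f}%")
print("above the typical range:", fraction > typical_range[1])
```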

The two C-terminal α-helices of P2 Cox contribute
extensively to the helical packing. Helix 4 contains two
tryptophans, W81 and W84, which both promote
oligomerization. W81 forms a hydrophobic core together with
several hydrophobic residues in α3 in the n-1 chain, as
well as π-stacking to P38 of the same neighboring chain
(Figure 1d and e). W84 has hydrophobic interactions with
K34 and K36 on n-1, as well as with D44 on n-2
(Figure 1d and e). Numerous hydrogen bonds and hydrophobic
interactions along the interfaces also contribute to
the packing of the proteins into the spiral-shaped crystal
arrangement. Interestingly, the DNA-binding motifs of P2
Cox, i.e. the wing binding to the minor groove and α-helix
2 binding to the major groove, do not significantly
participate in the crystal packing, but rather point
outwards from the spiral (Figure 1c).



Engineering of the putative interfaces of P2 Cox 

Our P2 Cox structure indicates that α-helix 2 is the
DNA-binding helix, similar to the homologs λXis and Tn916 Xis.
To further investigate the proposed DNA-recognition 
motif, amino acids thought to be involved in DNA 
binding were substituted for alanines, one at a time, by 
site-directed mutagenesis of plasmid pEE720 that contains 
the wild-type cox gene (Figure 2a). To test the biological 
activity of the mutations, the plasmids were transformed 
into strain C-6005, which is a lysogen containing the
P2 cox3 mutated prophage (10). This P2 lysogen is defective
in spontaneous phage production during growth (8).
Upon transformation with plasmid expressing the wild- 
type P2 Cox protein, the spontaneous phage production 
increases and the number of plaques per colony forming 
bacteria (PFU/CFU) is about four orders of magnitude 
higher compared to the defective P2 lysogen. As shown in 
Figure 2b and Supplementary Table S2, most mutants 
give almost no release of phages at all, and thus are 
unable to complement the cox3 defective prophage. The 
T25A, R30A, D33A and K75A mutants complement the 
defective prophage to some extent, although substantially 
less than wild-type P2 Cox. The loop of the wing structure
of Cox is expected to contact the minor groove of DNA
(Figure 1b). Therefore, R51 was also separately mutated
into alanine. The substitution completely abolishes the
biological activity in this system (Figure 2b), supporting
the importance of the wing in DNA binding. Since the
structure indicates that the C-terminal region is involved in
oligomerization and possibly in DNA interaction, amino acids
W81 and W84, as well as R73, K75 and R78, were also
substituted by alanine residues. Figure 2b shows that the
mutants R73A, R78A, W81A and W84A are all unable to
complement the defective lysogen, as opposed to K75A,
which allows some spontaneous phage production.
Finally, to confirm the importance of the last helix, a 
stop codon was inserted at position 70, and this eliminates 
P2 Cox protein activity. To check that the substitutions
do not greatly affect protein expression or
folding, we overexpressed all variants and assayed their
solubility. It was not feasible to do this in the complementation
assay, as this assay relies on T7 promoter leakage,
resulting in very low expression levels. All P2 Cox variants
express, and all but two of the P2 Cox variants (D33A and
P38A) express as soluble proteins as detected by SDS-PAGE
on clarified cell extracts (Supplementary Figure
S1). The D33A variant partially complements the defective
prophage even though no soluble protein is detected.
Some soluble protein is likely still present, but at
levels too low to detect by SDS-PAGE. In the
case of P38A, we cannot rule out that its lack of activity
stems from its decreased solubility.

Figure 1. Structure of P2 Cox. (a) Cartoon representation, colored blue to red from the N- to the C-terminus. All secondary structure elements
are labeled in the figure. The DNA-binding helix and wing are both marked with an arrow. (b) Superimposition of P2 Cox (magenta)
and the DNA-bound structure of truncated λXis (green, PDB code: 2OG0). The structural fold of the DNA-interacting elements is conserved, with
the exception of the much longer loop, interacting with the minor groove, in P2 Cox. (c) Visualization of the crystal packing of P2 Cox,
from the side (left panel) and from the top (right panel). Cox packs into a left-handed spiral, with a diameter of 65 Å and a pitch of 35 Å. Each
protomer in the helical filament is colored individually. (d and e) Stereo figures of close-up views of the major oligomerization areas, with important
residues shown as sticks. The approximate location of each close-up with respect to the helical filament is shown by the gray (d) and magenta
(e) rectangles in (c) (~90° rotated). Four protomers in the helical filament are shown, n+1, n, n-1 and n-2 in green, magenta, yellow and
blue, respectively.

Figure 2. Functionally important residues in P2 Cox. (a) Sequence alignment of P2 Cox and its closest sequence and structural homologs. Single
alanine substitutions of P2 Cox that have been examined in this work are in bold font. Residues marked with an x above are altered in some
previously identified cox mutants. (b) Level of spontaneous phage production, measured as the number of plaque forming units per colony forming unit.
C-6005 is a cox deficient strain. All single alanine mutations were made on the previously designed pEE720 plasmid. All data have been normalized
against the pEE720 wild type and are shown on a logarithmic scale. Error bars show the standard deviation of the normalized values over at least
three independent measurements. For the raw data, please see Supplementary Table S2. (c) The positions of the residues that were exchanged into
alanine in this study are shown as purple sticks in the cartoon representation of P2 Cox, colored blue to red from the N- to the C-terminus. R51 is
located in a disordered loop, and its location is therefore only approximate.

By comparing our P2 Cox structure with the truncated λXis-DNA
structure, we can couple mutations, both from the
alanine scanning performed in this work and from previously
isolated mutants, to the presumed function of the
amino acid substituted. An alanine exchange can in principle
affect three things in P2 Cox: the DNA binding, the
protein packing, the oligomerization, or a combination
thereof. Of the mutants shown in Figure 2b, nine are
suggested to directly affect the DNA binding (T25A,
R29A, R30A, D33A, R51A, R73A, K75A, R78A and
Δ70) to various degrees. Four of the alanine substitutions
are suggested to disturb the protein packing: (i) V28A
destroys the hydrophobic packing with α-helix 1; (ii)
M31A will interfere with the hydrophobic packing of
V28, L37 and K36; (iii) I32A could also lead to
reduced stability due to a decrease of hydrophobic
interactions with Y14, L37, V39 and Y55; (iv) P38A could
weaken the stability and rigidity provided to the
backbone by the proline, possibly destabilizing the
region preceding the wing motif and thereby also
disturbing the DNA-binding capacity. Alternatively, the effect
could be due to decreased solubility of the P38A variant
(Supplementary Figure S1). Finally, we identified five
mutants of P2 Cox that likely affect the oligomerization,
by interfering with either hydrophobic packing or stacking
effects to other protomers within the oligomer. These are
M31A (with the n-1 chain), P38A (with the n+1 chain), W81A
(with the n-1 chain), W84A (with the n-1 chain) and Δ70
(with the n-1 chain). W81 and W84 are located on the
extended C-terminal part, far from interactions with the
winged HTH motif in the monomer. The exposed
position of these large hydrophobic amino acids, usually
found in the hydrophobic core of proteins, prompted us to
investigate their importance by mutagenesis. The structure
suggests that W81 and W84 are involved in P2 Cox
oligomerization. The loss of function of the W81A and W84A
variants points to the importance of these residues for
protein function. Positions of the substituted residues in
P2 Cox are shown in Figure 2c.

Several P2 Cox mutants have previously been isolated
either as prophages unable to produce phages spontaneously,
i.e. with impaired excision activity, or as P2
mutants unable to induce DNA replication of prophage
P4, i.e. unable to act as a transcriptional activator
(6,8,10,16,24). These P2 Cox mutations are shown in the
sequence alignment in Figure 2a. The P2 Cox mutants
identified as prophages unable to spontaneously form
free phages during growth, i.e. cox2 (M42T), cox3
(G22E) and cox4 (E54K), all have substitutions that
probably disturb the protein packing as well as the
oligomerization of the protein complex. Introduction of a
charged glutamate in the place of a small glycine in a
tight turn after α-helix 1, as in cox3, could impair this
structural motif. The E54K exchange in cox4 produces a
charge change on the amino acid side chain that likely
extinguishes charge interactions with the adjacent
polypeptide chain in the oligomer. Among the cox mutants
isolated as unable to induce prophage P4, coxlOJ 
(R29H) presumably affects the DNA-binding capacity.
The A69T substitution in cox129 introduces a polar
residue in a hydrophobic patch including both chain
n+1 and n-1, thereby potentially influencing the
oligomerization behavior of P2 Cox. cox130 (E16K) does not
substantially interfere with any interactions, neither within
the P2 Cox packing nor in the oligomerization. This
corresponds well with previous data, which showed that this
substitution does not influence P2 Cox activity in a
substantial way (17).



DISCUSSION 

The Cox proteins constitute a unique family of directionality
factors, coupling the control of the transcriptional
switch with the integration/excision process (18,14,11).
Phylogenetically, P2 Cox belongs to a sub-group of this
family whose members have the predicted winged HTH domain
in the N-terminal half of the protein (37).

The crystal structure of P2 Cox presented in this article 
reveals that it shares the winged HTH with its homologs. 
Importantly, P2 Cox was crystallized as a full-length 
protein, in comparison to its homolog λXis, for which
only the winged HTH domain was crystallized (truncated λXis).
The structure reveals that P2 Cox has a defined structure
of its C-terminal half, giving rise to both oligomerization
and DNA-binding possibilities. Structural comparison of
P2 Cox with the truncated λXis-DNA structure allows us to
identify amino acids, in the winged HTH, that are positioned
so that they can make either specific or non-specific
contacts with the target DNA in the cox boxes.
To confirm the biological importance of these residues, 
they were independently substituted by alanines, and 
checked for biological activity via a phage production 
assay. Most substitutions effectively inactivate P2 Cox. 
However, five mutants (T25A, G26A, R30A, D33A and 
K75A) are able to complement the cox3 defective 
prophage, albeit at highly decreased levels, indicating 
that the exchange of these residues alone is not enough 
to completely disturb the protein function. 

By analyzing previous biochemical data in the light of
our new structural information, we can now propose
which residues are important for DNA recognition. Two
close homologs of P2 Cox have previously been studied,
the Cox proteins from the phages P2 Hy dis and WΦ
(18,38,39). Hy dis Cox can complement a P2 cox3 defective
prophage, whereas WΦ Cox cannot. Sequence comparison,
together with the present P2 Cox structure,
indicates a couple of residues likely to be involved in the
specific DNA base recognition, such as T25 and D33. Both
the T25A and the D33A mutants complement the defective
prophage to some extent, although substantially less
than wild-type Cox. Hy dis Cox, which can complement
cox3, has a glutamate at position 33 while WΦ Cox has a
lysine. This indicates that the negative charge in P2 Cox
and Hy dis is important for DNA recognition. Position 25
is a threonine in both P2 and Hy dis Cox, whereas it is a
proline in WΦ Cox. For other residues likely to be
involved in DNA binding, no such dissimilarities can be
observed between the three Cox proteins, thereby suggesting
that T25 and D33 are important for DNA specificity.

[Figure 3, panel (d): DNA sequences with the -35 and -10 promoter elements marked.]
Consensus from 20 P2 cox boxes: T T A A A (G/C) N C (A/C)

Figure 3. P2 Cox oligomerization and DNA binding. (a) Model of how P2 Cox can bind and wrap DNA around its oligomeric state, as seen from
the top (left panel) and the side (right panel). The P2 Cox spiral has a similar diameter to that of DNA wrapping around nucleosome core particles, and the
modeled DNA is taken from a DNA-bound nucleosome structure (PDB code: 3AV1). (b) Stereo-view of a zoom in on three of the protomers (n+1, n
and n-1) in the filament. Coloring scheme is the same as in Figure 1. To simplify the visualization of the DNA binding, only part of the modeled
DNA from (a) is shown. The proposed DNA-interacting elements of P2 Cox, the helix and wing, align well with the major and minor groove of the
DNA, respectively, along the filament chain. (c) Stereo-view, rotated 45° around the x-axis when compared with (b). For clarity we are here only

(continued)

Arginines are often found in interactions with DNA
(40). We therefore also studied two arginines positioned
in α-helix 2 (R29 and R30), as well as the well-conserved
arginines found in the C-terminal part close to α-helices 3 and 4
of P2 Cox (R51, R73 and R78). The alanine scanning
showed that all but the R30A substitution entirely
abolish the biological function of the protein
(Figure 2b). The three latter residues are all conserved
between the Cox proteins discussed here, indicating that
they are functionally important. R51 is located in the wing
domain, which likely binds to the minor groove of the
DNA (Figures 1b and 2c). R73 and R78 do not partake in
any major interactions in the helical filament; rather, they
point outwards from the helical filament. In a DNA-binding
model, those residues of the n-5 protomer could
interact with the backbone of the same stretch of DNA
that is bound via the helix and wing of the n protomer
(and n-6 with the DNA bound to n-1, etc.) (Figure 3a-c).
The importance of α-helices 3 and 4 is further
supported by the fact that a deletion of the last 20
amino acids destroys the biological activity of P2 Cox.

An intriguing finding is the relatively large variation in 
spontaneous phage production in the complementation 
assay with some of the alanine substitutions 
(Supplementary Table S2). Since the experiments are 
standardized, and the controls do not show such large 
variations, it is most likely a result of an unknown 
factor affecting the transcriptional switch during growth. 
For example, the combination of the Cox3 protein ex- 
pressed from the prophage and the alanine substituted 
Cox proteins expressed from the plasmids may form 
hetero-oligomeric structures with unknown functionality. 
Those could affect the DNA binding and consequently the 
downstream reactions in a stochastic way, giving different 
outcomes in an overnight culture. However, the clarification 
of this is beyond the scope of this work. 

In gel-filtration and cross-linking experiments Cox has 
been shown to form tetramers and octamers in the absence 
of DNA (17). This indicates that Cox might initially associate 
with its DNA targets as a multimer in vivo. Gel shift 
analyses have been performed (17,41), and with increasing 
Cox concentration slower migrating DNA fragments are 
seen in the gels, indicating formation of larger protein 
complexes, but the number of Cox molecules per DNA 
fragment has never been determined. Thus, larger protein 
complexes are necessary for biological activity. Although 
we here show P2 Cox filament formation in vitro also in 
the absence of DNA, which happens at relatively high 
protein concentrations, we suggest that DNA binding 
may initiate and facilitate filament formation. By 
analysis of the spiral shaped crystal packing together 
with the biochemical data presented here we can show 
that the extensive contacts formed between the P2 Cox 
monomers in the crystal are likely to be important for 
the biological function. The contacts bury 55% of the 
surface area of the protein, strongly indicating that 
the helix-shaped crystal packing is not a crystallization 
artifact. Intriguingly, the diameter and pitch of the 
P2 Cox spiral are virtually the same as those of DNA 
wrapped around nucleosome core particles, which have 
a diameter and pitch of 65 ± 5 and 30 ± 5 Å, respectively, 
as measured on different nucleosomes (PDB codes 3AV1, 
1AOI, 1EQZ, 1F66, 1M1A and 2CV5). Furthermore, the 
P2 Cox helical filament is left-handed, as are nucleosomes 
(42). Although not a direct proof, it is very tempting to 
speculate that P2 Cox binds to its target DNA in a similar 
manner, by wrapping the DNA around its own 
spiral (Figure 3a). Previous biochemical data also 
support this theory. It has been shown that P2 Cox 
induces DNA bending upon binding, but only if there 
are more than four cox-boxes (18). Furthermore, DNA 
footprinting experiments have shown that DNA bound 
to P2 Cox is sensitive to DNase at specific nucleotide 
distances, indicating DNA wrapping (8,9,18,23). Finally, the 
DNA-binding elements of P2 Cox are located favorably 
on the outside of the spiral, with a distance of ~27 Å between 
the Cα atoms of R30 in neighboring monomers 
of the filament, consistent with 
the distance between the major grooves of bent DNA 
(Figure 3b). This allows P2 Cox to interact with the 
DNA wrapped around the P2 Cox spiral (Figure 3a and 
b). The cox-boxes are distributed with 10-11 bases 
between the start of each site, consistent with the 
modeled P2 Cox-DNA complex, with one protomer 
binding to the major groove with each turn of the DNA 
helix (Figure 3b). 
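
To make the spacing argument concrete, the following minimal Python sketch scans a sequence for cox-box-like sites in both orientations and reports the distance between consecutive site starts. It is an illustration only and not the procedure used in this work: the function names, the scoring rule (number of positions out of nine that agree with the consensus T T A A A (G/C) N C (A/C), with N always counted as a match), the default threshold and the toy sequence are all assumptions made here.

```python
# Consensus T T A A A (G/C) N C (A/C), written as the set of bases allowed per position.
CONSENSUS = ("T", "T", "A", "A", "A", "GC", "ACGT", "C", "AC")
WIDTH = len(CONSENSUS)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def score(window):
    """Count positions agreeing with the consensus (assumed rule: N always matches)."""
    return sum(base in allowed for base, allowed in zip(window, CONSENSUS))


def find_cox_boxes(seq, min_score=8):
    """Return sorted (start, orientation, score) tuples for candidate sites.

    start is the 0-based index of the leftmost base on the given strand;
    orientation '-' means the site matches on the complementary strand.
    The min_score threshold is a hypothetical choice, not a published value.
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    hits = []
    for i in range(len(seq) - WIDTH + 1):
        forward = score(seq[i:i + WIDTH])
        if forward >= min_score:
            hits.append((i, "+", forward))
        reverse = score(rc[i:i + WIDTH])
        if reverse >= min_score:
            # Map the reverse-strand index back to forward-strand coordinates.
            hits.append((len(seq) - i - WIDTH, "-", reverse))
    return sorted(hits)


def spacings(hits):
    """Distances in bases between the starts of consecutive sites."""
    starts = [start for start, _, _ in hits]
    return [b - a for a, b in zip(starts, starts[1:])]


# Synthetic toy sequence (not a P2 or P4 sequence): two perfect sites, 11 bases apart.
toy = "GGTTAAAGGCAGGTTAAACTCCGG"
toy_hits = find_cox_boxes(toy, min_score=9)
print(toy_hits, spacings(toy_hits))
```

Because natural cox-boxes match the consensus only partially, a lower `min_score` would be needed on real target regions, at the cost of more spurious hits; spacings of 10-11 bases between same-orientation sites are what the wrapping model would predict.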

These results now allow us to add a molecular understanding 
of how P2 Cox performs its three mechanistically 
very different functions: (i) repression of the P2 Pc 
promoter, (ii) activation of the P4 satellite phage and (iii) 
activation of excision. P2 Cox has been shown to 
protect a DNA region of at least 70 base pairs containing 
the different cox-boxes. At high P2 Cox concentrations 
the protection extends into the DNA flanking these regions 
(8,11,18). As can be seen in Figure 3d, the cox-boxes have 
different orientations in the different DNA target regions. 
We propose that the cox-boxes act as initiator sites for 
filament formation, allowing P2 Cox to bind and wrap 
the DNA around itself. We also propose that filament 
formation only occurs in the direction of the orientation 
of the cox-boxes. First, the P2 Pc promoter region is 
overlapped by the cox-boxes, thereby allowing P2 Cox to 
bind and wrap the DNA, effectively stopping transcription 
by blocking the −10 and −35 regions. Furthermore, at 
higher P2 Cox concentrations, footprint experiments 
have shown that the filament also starts to grow in the 
direction of the first cox-box toward the opposite-facing 
Pe promoter, located ~60 base pairs upstream of Pc, thus 
also repressing the Pe promoter. This notion is also supported 
by the fact that P2 Cox is autoregulating (5). 
Second, in the PLL promoter of the satellite phage P4 
the first cox-box is located downstream and facing away 
from the −10 and −35 regions, hence leaving the 
promoter region accessible for transcription upon P2 
Cox DNA binding and wrapping. Previous footprint 



Figure 3. Continued 

showing the n and the n−5 protomers. The n−5 protomer is located underneath the n protomer, almost a full turn of the filament away. R73, K75 and 
R78 from the n−5 protomer are highlighted as red spheres, showing the possibility for them to interact with the DNA backbone. (d) The sequence 
logo of a cox-box is shown, taken from 20 different cox-boxes (34). The cox-box consensus sequence is T T A A A (G/C) N C (A/C). The DNA 
sequences of the P2 Cox target DNA in phage P2 and satellite phage P4, as well as the sequence logo of the cox-boxes, are shown. The locations of 
the cox-boxes are indicated in gray, with the orientation of the cox-boxes and their identity to the consensus sequence indicated above the respective 
cox-box. Note that the cox-boxes with reverse orientation match the sequence logo on the complementary strand. The locations 
of the −10 and −35 regions in P2 Pc and in P4 PLL are indicated (6,16,43). 
analysis supports this since the cleavage pattern is unchanged 
over the PLL promoter with increased protein 
concentration (16). This would indicate that filament formation 
only occurs in one direction at PLL. Third, at the 
attP region the six identified cox-boxes are facing two different 
ways, thus indicating filament formation in both 
directions, also supported by footprint analysis (11). By 
growing toward the core and arm binding sites of P2 
integrase, P2 Cox filament formation could have implications 
for intasome formation and consequently phage integration. 
The inhibitory effect of P2 Cox on integration is 
well known (8). Here, cooperative binding of P2 integrase 
and P2 Cox to attP probably reduces the amount of Cox 
needed in vivo to achieve this inhibitory effect. By a similar 
filament formation mechanism in attL, P2 Cox probably 
also determines the nature of the excisome in phage 
excision. 

Previous data suggest that the concentration of P2 Cox 
determines to what extent the DNA is protected upon P2 
Cox binding (11,16). Since expression of P2 Cox leads to 
repression of the P2 Pc promoter, excision activity, as well 
as activation of the P4 satellite prophage, it is clearly of 
great importance for the organism to tightly control the 
P2 Cox expression levels. However, further studies are 
needed to clarify the exact mechanism of P2 Cox 
regulation. 

To summarize, we have here presented the crystal structure 
of full-length P2 Cox. We have through genetic and 
functional studies indicated the importance of key 
residues, suggested by our structural results, likely to be 
involved in oligomerization and DNA binding. 
Furthermore, the C-terminal region was identified as responsible 
for packing into helical filaments. The helical 
oligomerization revealed by the crystal structure of P2 
Cox has given us the basis to interpret previous biochemical 
results at a structural level. It is the DNA binding, in 
particular the DNA wrapping around the P2 Cox helical 
oligomer, which in a direct way either blocks or unblocks 
the promoter regions around the different cox binding 
sites, thereby acting as either a repressor or activator. 